(88/314) 2064591 - XML plugin auto closes incorrect tag

jEdit (r13380), XML v2.0.9

With the attached file, when I try to close the '<head>' tag by typing '</', it closes the '<script>' tag on Line 7 instead. It's reading the '<script>' tags inside of the strings, which is incorrect.

Submitted	hunteke - 2008-08-21 - 11:36:41z	Assigned	kerik-sf
Priority	5	Category	None
Status	Open	Group	None
Resolution	None	Visibility	No

Comments

2008-09-02 - 15:10:57z kpouer	Hi, that's nice. The problem is that your HTML document is not a valid XML document. You could put your JavaScript code in XML comments; I think it will still work, and it will also fix the problem. I don't know whether this can be fixed in the XML plugin, so wait for Alan's answer.
2008-09-02 - 17:58:04z hunteke	Alright, I'll buy that it's not a valid XML document. However, I'll also point out that the plugin has been useful to me in an HTML role as well: for instance, auto-closing tags, showing me matching tags, and enabling folds. To be fair, I'll point out that the engine at http://validator.w3.org/ has a similar problem. I will attach the file in two parts, for playing purposes: a base correct HTML document (but not XML), and a udiff to make it not work. File Added: html_compliant.html
2008-09-02 - 18:23:27z hunteke	$ patch -o html_borked.html < html_diff.diff After applying this patch to the previous attachment (html_compliant.html), note that the '</script>' tag on line 12 incorrectly matches a string on line 10. File Added: html_diff.diff
2010-05-16 - 17:56:35z ezust	Everything works fine with the "html_compliant.html" test file, as long as I am parsing with the HTML parser. However, if I add the code indicated in the diff, the contents of the <script> </script> element seem to confuse the XML parser and make it think there are a bunch of open script tags that need to be closed. This might be a bug in the HTML sidekick parser part of the XML plugin.
2010-05-17 - 18:42:40z daleanson	Actually, the HTML sidekick delegates a lot to the XML sidekick. The problem seems to be in the XML sidekick code, specifically in xml.parser.TagParser.findStartTag(). This method just works backwards through the text until it finds a matching start tag, and it takes no account of the sidekick Asset, which gives the actual start and end of a tag. I'll attach an example XML file that demonstrates the problem. Open the file in jEdit, open Sidekick, and be sure you're parsing it as XML. You'll see the same problem as with the HTML file.
2010-05-18 - 07:30:06z hunteke	I apologize for forgetting to get back on this bug. I'm having a devil of a time tracking down the report now, or even a reference, but I have a recollection that my original assumption was incorrect. I recall pointing out this "error" to the W3 folks (or someone), and was informed that my formulation was actually incorrect. XML parsers correctly ignore the semantics that we humans assign to quotes. Thus, to do this type of document morphing, one needs to encode the left chevron ('<'). See attached file for one possible solution. (For completeness, I'll point out that I believe the style of building HTML fragments piecemeal via string operations, as I was apparently doing in 2008, is frowned upon.) The point is that the XML or HTML plugin is actually performing the correct action. Caveat emptor: as I haven't thought about this in over 18 months, I'm not sure that my recollection or the above analysis is correct.
2010-05-18 - 07:39:27z hunteke	Of course, now I look at your upload, Dale. Your example certainly /seems/ intuitive and explicit to me, but I'm afraid I don't know XML well enough to know for sure. I have a hunch, however, that the XML snippet is in fact not correct. My gut says you need to encode not just the left chevron, but the right one as well. Suggested differential attached.
2010-05-18 - 17:09:11z kerik-sf	Dale: your example indeed doesn't parse as XML. I don't agree about using the sidekick asset instead of TagParser, since the sidekick information is outdated most of the time (even with parse on keystroke, if there is an error before the point where you are editing). So I've found TagParser quite handy, since it uses only local context. This works most of the time, but not for <![CDATA[ some <tag> ]]> and not for the HTML script element. Ignoring CDATA sections for now, as soon as you leave the <head> section, every element auto-closes correctly (OK, if you type "</" after </body>, you'll get back the </script>).
2010-05-18 - 17:30:42z daleanson	Sorry to muddy the situation -- I wasn't saying the file I attached would parse correctly as XML since it won't. I was just trying to point out where the problem is. The problem is that the html sidekick delegates to the xml sidekick for tag matching. Often, html is not valid xml, which means the xml sidekick does not necessarily find the matching tag, as the original example and my example show. Probably the right fix in this case is to have the html sidekick do its own tag matching.
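
The behaviour discussed in the comments above can be reproduced with a small standalone sketch. This is not the plugin's TagParser code: the class NaiveTagScan, the method findUnclosedTag, and the regular expression are hypothetical and only assume a scanner that walks backwards over anything that looks like a tag, with no knowledge of quotes, CDATA sections, or the HTML script element.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; NOT the XML plugin's actual TagParser.
class NaiveTagScan {
    // Anything shaped like <name ...> or </name> counts as a tag.
    // Quotes, comments and CDATA sections are deliberately ignored.
    private static final Pattern TAG =
        Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)[^<>]*>");

    /** Name of the nearest unclosed start tag before caret, or null if none. */
    static String findUnclosedTag(String text, int caret) {
        // Collect every tag-shaped match that ends before the caret.
        List<String[]> tags = new ArrayList<>();
        Matcher m = TAG.matcher(text).region(0, caret);
        while (m.find()) {
            if (m.group().endsWith("/>")) {
                continue; // self-closing tags never need a close tag
            }
            tags.add(new String[] { m.group(1), m.group(2) });
        }
        // Walk backwards, pairing each close tag with the start tag before it.
        Deque<String> pendingClose = new ArrayDeque<>();
        for (int i = tags.size() - 1; i >= 0; i--) {
            boolean isClose = !tags.get(i)[0].isEmpty();
            String name = tags.get(i)[1];
            if (isClose) {
                pendingClose.push(name);
            } else if (pendingClose.isEmpty()) {
                return name; // first start tag with no close tag after it
            } else if (pendingClose.peek().equals(name)) {
                pendingClose.pop();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String quoted = "<html><head><script>var s = \"<script>\";</script>";
        String encoded = "<html><head><script>var s = \"&lt;script>\";</script>";
        System.out.println(findUnclosedTag(quoted, quoted.length()));   // prints script
        System.out.println(findUnclosedTag(encoded, encoded.length())); // prints head
    }
}
```

In the first input the '<script>' inside the string literal is paired with the real '</script>', so the real start tag looks unclosed and would be offered for closing instead of '<head>'. Encoding the left chevron removes the tag-shaped text from the scanner's view, which is why that workaround sidesteps the mismatch.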

Attachments

2008-09-02 - 17:58:04z hunteke	html_compliant.html HTML compliant, and correctly recognized by jEdit/XML plugin
2008-09-02 - 18:23:26z hunteke	html_diff.diff Diff against html_compliant.html to break XML plugin
2010-05-17 - 18:43:41z daleanson	xml_sidekick_matching_example.xml xml file that causes the same problem
2010-05-18 - 07:32:49z hunteke	proper_method.html Example of encoding the left chevron before use.
2010-05-18 - 07:40:23z hunteke	diff_to_fix.diff Encoding of chevrons to fix example XML file
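
The chevron encoding mentioned in the last two attachments could be done with a helper along the following lines. This is a sketch, not the content of either attachment: the class ChevronEscape and the method escape are hypothetical names. In XML character data only '<' and '&' strictly have to be escaped; '>' must be escaped only when it ends the sequence ']]>', but encoding it everywhere is harmless and matches the suggestion to encode both chevrons.

```java
// Hypothetical helper; not taken from proper_method.html or diff_to_fix.diff.
class ChevronEscape {
    /** Replaces '&', '<' and '>' with their predefined XML entities. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break; // each char is visited once, so output is never re-escaped
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;  // optional in most contexts
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("<script>")); // prints &lt;script&gt;
    }
}
```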